Multi-TGDR: A Regularization Method for Multi-Class Classification in Microarray Experiments
نویسندگان
چکیده
BACKGROUND As microarray technology has become mature and popular, the selection and use of a small number of relevant genes for accurate classification of samples has arisen as a hot topic in the circles of biostatistics and bioinformatics. However, most of the developed algorithms lack the ability to handle multiple classes, arguably a common application. Here, we propose an extension to an existing regularization algorithm, called Threshold Gradient Descent Regularization (TGDR), to specifically tackle multi-class classification of microarray data. When there are several microarray experiments addressing the same/similar objectives, one option is to use a meta-analysis version of TGDR (Meta-TGDR), which considers the classification task as a combination of classifiers with the same structure/model while allowing the parameters to vary across studies. However, the original Meta-TGDR extension did not offer a solution to the prediction on independent samples. Here, we propose an explicit method to estimate the overall coefficients of the biomarkers selected by Meta-TGDR. This extension permits broader applicability and allows a comparison between the predictive performance of Meta-TGDR and TGDR using an independent testing set. RESULTS Using real-world applications, we demonstrated the proposed multi-TGDR framework works well and the number of selected genes is less than the sum of all individualized binary TGDRs. Additionally, Meta-TGDR and TGDR on the batch-effect adjusted pooled data approximately provided same results. By adding Bagging procedure in each application, the stability and good predictive performance are warranted. CONCLUSIONS Compared with Meta-TGDR, TGDR is less computing time intensive, and requires no samples of all classes in each study. On the adjusted data, it has approximate same predictive performance with Meta-TGDR. Thus, it is highly recommended.
منابع مشابه
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in the text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
متن کاملFeature-based Malicious URL and Attack Type Detection Using Multi-class Classification
Nowadays, malicious URLs are the common threat to the businesses, social networks, net-banking etc. Existing approaches have focused on binary detection i.e. either the URL is malicious or benign. Very few literature is found which focused on the detection of malicious URLs and their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...
متن کاملNonparametric regression and classification with joint sparsity constraints
We propose new families of models and algorithms for high-dimensional nonparametric learning with joint sparsity constraints. Our approach is based on a regularization method that enforces common sparsity patterns across different function components in a nonparametric additive model. The algorithms employ a coordinate descent approach that is based on a functional soft-thresholding operator. T...
متن کاملMULTI CLASS BRAIN TUMOR CLASSIFICATION OF MRI IMAGES USING HYBRID STRUCTURE DESCRIPTOR AND FUZZY LOGIC BASED RBF KERNEL SVM
Medical Image segmentation is to partition the image into a set of regions that are visually obvious and consistent with respect to some properties such as gray level, texture or color. Brain tumor classification is an imperative and difficult task in cancer radiotherapy. The objective of this research is to examine the use of pattern classification methods for distinguishing different types of...
متن کاملFace Recognition using an Affine Sparse Coding approach
Sparse coding is an unsupervised method which learns a set of over-complete bases to represent data such as image and video. Sparse coding has increasing attraction for image classification applications in recent years. But in the cases where we have some similar images from different classes, such as face recognition applications, different images may be classified into the same class, and hen...
متن کامل